Abstract

Pattern matching is an abstraction mechanism that can greatly
simplify source code. We present functional-style pattern matching for
C++ implemented as a library, called Mach7. All the patterns are
user-definable, can be stored in variables, passed among functions,
and allow the use of class hierarchies. As an example, we
implement common patterns used in functional languages.

Our approach to pattern matching is based on compile-time 
composition of pattern objects through concepts. This is superior 
(in terms of performance and expressiveness) to approaches based 
on run-time composition of polymorphic pattern objects. In partic- 
ular, our solution allows mapping functional code based on pattern 
matching directly into C++ and produces code that is only a few 
percent slower than hand-optimized C++ code. 

The library uses an efficient type switch construct, further ex- 
tending it to multiple scrutinees and general patterns. We compare 
the performance of pattern matching to that of double dispatch and 
open multi-methods in C++. 

Categories and Subject Descriptors D.1.5 [Programming
techniques]: Object-oriented Programming; D.3.3 [Programming
Languages]: Language Constructs and Features

General Terms Languages, Design 

Keywords Pattern Matching, C++ 

1. Introduction 

Pattern matching is an abstraction mechanism popularized by 
the functional programming community, most notably ML [12], 
OCaml [21], and Haskell [15], and recently adopted by several 
multi-paradigm and object-oriented programming languages such
as Scala [30], F# [7], and dialects of C++ [22, 29]. The expressive
power of pattern matching has been cited as the number one reason 
for choosing a functional language for a task [6, 25, 28]. 

This paper presents functional-style pattern matching for C++. 
To allow experimentation and to be able to use production-quality 
toolchains (in particular, compilers and optimizers), we imple- 
mented our matching facilities as a C++ library. 



The library is available at http://parasol.tamu.edu/mach7/
1.1 Summary

We present functional-style pattern matching for C++ built as an 
ISO C++11 library. Our solution:

• is open to the introduction of new patterns into the library, while 
not making any assumptions about existing ones. 

• is type safe: inappropriate applications of patterns to subjects 
are compile-time errors. 

• makes patterns first-class citizens in the language (§3.1).

• is non-intrusive, so that it can be retroactively applied to exist- 
ing types (§3.2). 

• provides a unified syntax for various encodings of extensible 
hierarchical datatypes in C++. 

• provides an alternative interpretation of the controversial n+k 
patterns (in line with that of constructor patterns), leaving the
choice of exact semantics to the user (§3.3). 

• supports a limited form of views (§3.4). 

• generalizes open type switch to multiple scrutinees and enables 
patterns in case clauses (§3.5). 

• demonstrates that compile-time composition of patterns through 
concepts is superior to run-time composition of patterns through 
polymorphic interfaces in terms of performance, expressive- 
ness, and static type checking (§4.1). 

Our library sets a standard for the performance, extensibility, 
brevity, clarity, and usefulness of any language solution for pat- 
tern matching. It provides full functionality, so we can experiment 
with the use of pattern matching in C++ and compare it to existing 
alternatives. Our solution requires only current support of C++11
without any additional tool support. 

2. Pattern Matching in C++

The object analyzed through pattern matching is commonly called 
the scrutinee or subject, while its static type is commonly called 
the subject type. Consider for example the following definition of 
factorial in Mach7:

int factorial(int n) {
  unsigned short m;
  Match(n) {
    Case(0) return 1;
    Case(m) return m*factorial(m-1);
    Case(_) throw std::invalid_argument("factorial");
  } EndMatch
}

The subject n is passed as an argument to the Match statement 
and is then analyzed through Case clauses that list various pat- 
terns. In the first-fit strategy typically adopted by functional lan- 
guages, the matching proceeds in sequential order while the pat- 
terns guarding their respective clauses are rejected. Eventually, the 
statement guarded by the first accepted pattern is executed or the 
control reaches the end of the Match statement.



The value 0 in the first case clause is an example of a value pat- 
tern. It will match only when the subject n is 0. The variable m in 
the second case clause is an example of a variable pattern. It will 
bind to any value that can be represented by its type. The name _ in 
the last case clause refers to the common instance of the wildcard 
pattern. Value, variable, and wildcard patterns are typically referred 
to as primitive patterns. The list of primitive patterns is often
extended with a predicate pattern (e.g. as seen in Scheme [49]), which
allows the use of any unary predicate or nullary member-predicate
as a pattern: e.g. Case(even) ... (assuming bool even(int);) or
Case([](int m) { return m^m-1; }) ... for λ-expressions.
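The four primitive patterns can be pictured as small callable objects. The following is only a minimal sketch under that assumption: the class names (wildcard_pattern, value_pattern, variable_pattern, predicate_pattern) and the helper predicate are hypothetical and are not the Mach7 classes, and the first-fit order of the Match statement is written out by hand with if statements.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-ins for the primitive patterns (not Mach7's classes).
struct wildcard_pattern {                      // accepts anything, binds nothing
    template <typename S> bool operator()(const S&) const { return true; }
};

template <typename T> struct value_pattern {   // accepts one specific value
    T expected;
    bool operator()(const T& s) const { return s == expected; }
};

template <typename T> struct variable_pattern { // accepts and binds
    mutable T bound{};
    template <typename S> bool operator()(const S& s) const {
        bound = static_cast<T>(s);
        return static_cast<S>(bound) == s;      // reject values T cannot represent
    }
};

template <typename F> struct predicate_pattern { // any unary predicate
    F pred;
    template <typename S> bool operator()(const S& s) const { return pred(s); }
};
template <typename F> predicate_pattern<F> predicate(F f) { return {f}; }

inline bool even(int n) { return n % 2 == 0; }

// First-fit matching written out by hand, mirroring the factorial example.
inline int factorial(int n) {
    value_pattern<int> zero{0};
    variable_pattern<unsigned short> m;
    if (zero(n)) return 1;
    if (m(n))    return m.bound * factorial(m.bound - 1);
    throw std::invalid_argument("factorial");   // the wildcard clause
}
```

A negative argument is rejected by the variable pattern because it cannot be represented as unsigned short, so control falls through to the last clause, as in the Mach7 version.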

The predicate pattern is a use of a predicate as a pattern and 
should not be confused with a guard, which is a predicate attached 
to a pattern that may make use of the variables bound in it. The 
result of the guard's evaluation will determine whether the case 
clause and the body associated with it will be accepted or rejected. 
Guards give rise to guard patterns, which in Mach7 are expressions
of the form P|=E, where P is a pattern and E is its guard.

Pattern matching is closely related to algebraic data types. 
In ML and Haskell, an Algebraic Data Type is a data type each 
of whose values are picked from a disjoint sum of data types, 
called variants. Each variant is a product type marked with a 
unique symbolic constant called a constructor. Each constructor
provides a convenient way of creating a value of its variant
type as well as discriminating among variants through pattern
matching. In particular, given an algebraic data type
$D = C_1(T_{11}, \ldots, T_{1m_1}) \mid \cdots \mid C_k(T_{k1}, \ldots, T_{km_k})$,
an expression of the form $C_i(x_1, \ldots, x_{m_i})$ in a
non-pattern-matching context is called a value constructor and
refers to a value of type $D$ created via the constructor $C_i$
and its arguments $x_1, \ldots, x_{m_i}$. The same expression in the
pattern-matching context is called a constructor pattern and is used
to check whether the subject is of type $D$ and was created with the
constructor $C_i$. If so, it matches the actual values it was constructed
with against the nested patterns $x_j$.

C++ does not directly support algebraic data types. However, 
such types can be encoded in the language in a number of ways. 
Common object-oriented encodings employ an abstract class to
represent the algebraic data type and derived classes to represent
variants. Consider for example the following representation of the
terms of the λ-calculus in C++:

struct Term { virtual ~Term() {} };
struct Var : Term { std::string name; };
struct Abs : Term { Var& var; Term& body; };
struct App : Term { Term& func; Term& arg; };

C++ allows a class to have several constructors, but it does not 
allow overloading the meaning of construction for use in pat- 
tern matching. This is why in Mach7 we have to be slightly 
more explicit about constructor patterns, which take the form
C<T_i>(P_1, ..., P_{m_i}), where T_i is the name of the user-defined
type we are decomposing and P_1, ..., P_{m_i} are patterns that will
be matched against members of T_i. 'C' was chosen to abbreviate
"Constructor pattern" or "Case class" as its use resembles the use 
of case classes in Scala [30]. For example, we can write a com- 
plete (with the exception of bindings discussed in §3.2) recursive 
implementation of testing for the equality of two lambda terms as: 

bool operator==(const Term& left, const Term& right) {
  var<const std::string&> s; var<const Term&> x,y;
  Match(left, right) {
    Case(C<Var>(s),   C<Var>(+s))    return true;
    Case(C<Abs>(x,y), C<Abs>(+x,+y)) return true;
    Case(C<App>(x,y), C<App>(+x,+y)) return true;
    Otherwise()                      return false;
  } EndMatch
}



This == is an example of a binary method: an operation that re- 
quires both arguments to have the same type [3]. In each of the 
case clauses, we check that both subjects are of the same dynamic 
type using a constructor pattern. We then decompose both subjects 
into components and compare them for equality with the help of a 
variable pattern and an equivalence combinator ('+') applied to it. 
The use of an equivalence combinator turns a binding use of a vari- 
able pattern into a non-binding use of that variable's current value 
as a value pattern. We chose to overload unary + because in C++ it
turns an l-value into an r-value, which has a similar semantics here.

In general, a pattern combinator is an operation on patterns 
to produce a new pattern. Other typical pattern combinators, sup- 
ported by many languages, are conjunction, disjunction and nega- 
tion combinators, which all have an intuitive Boolean interpreta- 
tion. We add a few non-standard combinators to Mach7 that reflect 
the specifics of C++, e.g. the presence of pointers and references. 
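The equivalence combinator and the Boolean combinators can be sketched as templates that wrap other patterns. This is an illustration only: the types var_p, equiv_p, conj_p, disj_p, and neg_p and the factory functions both, either, and negated are hypothetical names, and named factories are used here instead of overloaded operators to keep the sketch short.

```cpp
#include <cassert>

// Hypothetical minimal patterns for illustration; not the Mach7 classes.
template <typename T> struct var_p {              // variable pattern: binds
    mutable T bound{};
    bool operator()(const T& s) const { bound = s; return true; }
};

template <typename T> struct equiv_p {            // equivalence: compares only
    const var_p<T>& v;
    bool operator()(const T& s) const { return v.bound == s; }
};
// Unary + turns a binding use of a variable into a use of its current value.
template <typename T> equiv_p<T> operator+(const var_p<T>& v) { return {v}; }

template <typename P1, typename P2> struct conj_p {   // both must accept
    P1 p1; P2 p2;
    template <typename S> bool operator()(const S& s) const { return p1(s) && p2(s); }
};
template <typename P1, typename P2> struct disj_p {   // at least one accepts
    P1 p1; P2 p2;
    template <typename S> bool operator()(const S& s) const { return p1(s) || p2(s); }
};
template <typename P> struct neg_p {                  // accepts what P rejects
    P p;
    template <typename S> bool operator()(const S& s) const { return !p(s); }
};

template <typename P1, typename P2> conj_p<P1, P2> both(P1 a, P2 b)   { return {a, b}; }
template <typename P1, typename P2> disj_p<P1, P2> either(P1 a, P2 b) { return {a, b}; }
template <typename P> neg_p<P> negated(P p) { return {p}; }

struct positive { bool operator()(int n) const { return n > 0; } };
struct even     { bool operator()(int n) const { return n % 2 == 0; } };

// Relational use: the second subject must equal what the first one bound.
inline bool same_value(int left, int right) {
    var_p<int> x;
    return x(left) && (+x)(right);
}
```

Because every combinator is a distinct type parameterized by its operands, the whole pattern is known to the compiler, which is the property Section 3 relies on.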

The equality operator on λ-terms demonstrates both nesting of
patterns and relational matching. The variable pattern was nested 
within an equivalence pattern, which in turn was nested inside 
a constructor pattern. The matching was also relational because 
we could relate the state of two subjects. Both aspects are even 
better demonstrated in the following well-known functional
solution to balancing red-black trees with pattern matching due to
Chris Okasaki [32, §3.3] implemented in Mach7:

class T{enum color{black,red} col; T* left; K key; T* right;};

T* balance(T::color clr, T* left, const K& key, T* right) {
  const T::color B = T::black, R = T::red;
  var<T*> a, b, c, d; var<K&> x, y, z; T::color col;
  Match(clr, left, key, right) {
    Case(B, C<T>(R, C<T>(R, a, x, b), y, c), z, d) ...
    Case(B, C<T>(R, a, x, C<T>(R, b, y, c)), z, d) ...
    Case(B, a, x, C<T>(R, C<T>(R, b, y, c), z, d)) ...
    Case(B, a, x, C<T>(R, b, y, C<T>(R, c, z, d))) ...
    Case(col, a, x, b) return new T{col, a, x, b};
  } EndMatch
}

The ... in the first four case clauses above stands for
return new T{R, new T{B,a,x,b}, y, new T{B,c,z,d}};.

To demonstrate the openness of the library, we implemented 
numerous specialized patterns that often appear in practice and 
are even built into some languages. For example, the following 
combination of regular-expression and one-of patterns can be used 
to recognize a toll-free phone number. 

rex("([0-9]+)-([0-9]+)-([0-9]+)", any({800,888,877}), n, m)

The regular-expression pattern takes a C++11 regular expression 
and an arbitrary number of sub-patterns. It uses matching groups to 
match against the sub-patterns. A one-of pattern takes an initializer 
list with a set of values and checks that the subject matches at least 
one of them. The variables n and m are integers, and the values 
of the last two parts of the pattern will be assigned to them. The 
parsing is generic and will work with any data type that can be read 
from an input stream; this is a common idiom in C++. Should we 
also need the exact area code, we can mix in a variable pattern using
the conjunction combinator: a && any(...).
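What the combined pattern checks can be spelled out with the standard regex library directly. The sketch below is not how Mach7 implements rex or any; the helpers one_of, parse, and toll_free are hypothetical names that reproduce the described behavior by hand: match the groups, parse each group through an input stream, and test the first group against a set of values.

```cpp
#include <cassert>
#include <initializer_list>
#include <regex>
#include <sstream>
#include <string>

// Hypothetical helper: true when s equals at least one listed value.
inline bool one_of(int s, std::initializer_list<int> allowed) {
    for (int a : allowed)
        if (a == s) return true;
    return false;
}

// Generic parsing through an input stream, the idiom the text mentions.
template <typename T> bool parse(const std::string& text, T& out) {
    std::istringstream in(text);
    return static_cast<bool>(in >> out);
}

// Accepts "AAA-NNN-MMMM" when AAA is a toll-free area code; binds n and m.
inline bool toll_free(const std::string& text, int& n, int& m) {
    static const std::regex re("([0-9]+)-([0-9]+)-([0-9]+)");
    std::smatch groups;
    int area = 0;
    return std::regex_match(text, groups, re)
        && parse(groups[1].str(), area) && one_of(area, {800, 888, 877})
        && parse(groups[2].str(), n)
        && parse(groups[3].str(), m);
}
```

Each matching group plays the role of a subject for one sub-pattern; in this hand-written form n and m are assigned only when all earlier checks succeed.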

3. Implementation 

The traditional object-oriented approach to implementing first-class
patterns is based on run-time composition through interfaces.
This "patterns as objects" approach has been explored in
several different languages [11, 14, 34, 47]. Implementations differ 
in where bindings are stored and what is returned as a result, but in 
its most basic form it consists of the pattern interface with a virtual 



function match that accepts a subject and returns whether it was 
accepted or rejected. This approach is open to new patterns and 
pattern combinators, but a mismatch in the type of the subject and 
the type accepted by the pattern can only be detected at run-time. 
Furthermore, it implies significant run-time overhead (§4.1). 
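For contrast, the "patterns as objects" design described above can be sketched as follows. All names here (object, int_object, text_object, pattern, int_value_pattern, or_pattern, zero_or_one) are hypothetical and chosen only for illustration; the point is that composition happens through pointers to a common interface and that a subject of the wrong type is discovered only inside match, at run time.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Subjects need a common run-time representation in this design.
struct object { virtual ~object() {} };
struct int_object : object {
    int value;
    explicit int_object(int v) : value(v) {}
};
struct text_object : object {};

struct pattern {                                   // the pattern interface
    virtual ~pattern() {}
    virtual bool match(const object& subject) const = 0;
};

struct int_value_pattern : pattern {
    int expected;
    explicit int_value_pattern(int e) : expected(e) {}
    bool match(const object& subject) const override {
        // A type mismatch can only be noticed here, while running.
        const int_object* p = dynamic_cast<const int_object*>(&subject);
        return p && p->value == expected;
    }
};

struct or_pattern : pattern {                      // run-time composition
    std::shared_ptr<pattern> lhs, rhs;
    or_pattern(std::shared_ptr<pattern> l, std::shared_ptr<pattern> r)
        : lhs(std::move(l)), rhs(std::move(r)) {}
    bool match(const object& s) const override {
        return lhs->match(s) || rhs->match(s);     // two virtual calls
    }
};

inline std::shared_ptr<pattern> zero_or_one() {
    return std::make_shared<or_pattern>(std::make_shared<int_value_pattern>(0),
                                        std::make_shared<int_value_pattern>(1));
}
```

Applying zero_or_one() to a text_object compiles without complaint and simply returns false, which is the late detection of type errors the text criticizes.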

3.1 Patterns as Expression Templates 

Patterns in Mach7 are also represented as objects; however, they are
composed at compile time, based on C++ concepts. Concept is the 
C++ community's long-established term for a set of requirements 
for template parameters. Concepts were not included in C++11, 
but techniques for emulating them with enable_if [18] have been in 
use for a while. enable_if provides the ability to include or exclude 
certain class or function declarations from the compiler's consid- 
eration based on conditions defined by arbitrary metafunctions. 
To avoid the verbosity of enable_if, in this work we use the notation
for template constraints, a simpler version of concepts [42].
The Mach7 implementation emulates these constraints. 

There are two main constraints on which the entire library is
built: Pattern and LazyExpression. 

template <typename P> constexpr bool Pattern() {
  return Copyable<P>            // P must also be Copyable
      && is_pattern<P>::value   // this is a semantic constraint
      && requires (typename S, P p, S s) { // syntactic reqs:
           bool = { p(s) };     // usable as a predicate on S
           AcceptedType<P,S>;   // has this type function
         }; }

The Pattern constraint is the analog of the pattern interface from 
the patterns as objects solution. Objects of any class P satisfying 
this constraint are patterns and can be composed with any other 
patterns in the library as well as be used in the Match statement. 

Patterns can be passed as arguments of a function, so they must 
be Copyable. Implementation of pattern combinators requires the 
library to overload certain operators on all the types satisfying the 
Pattern constraint. To avoid overloading these operators for types 
that satisfy the requirements accidentally, the Pattern constraint is 
a semantic constraint, which means that classes claiming to satisfy 
it have to state that explicitly by specializing the is_pattern<P>
trait. The constraint also introduces some syntactic requirements, 
described by the requires clause. In particular, because patterns 
are predicates on their subject type, they require presence of an 
application operator that checks whether a pattern matches a given 
subject. Unlike the patterns as objects approach, the Pattern 
constraint does not impose any restrictions on the subject type S. 
Patterns like the wildcard pattern will leave the S type completely 
unrestricted, while other patterns may require it to satisfy certain 
constraints, model a given concept, inherit from a certain type, etc. 
The application operator will typically return a value of type bool 
indicating whether the pattern is accepted on a given subject or 
rejected. 
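The opt-in nature of the semantic constraint can be emulated in plain C++11 with a trait and enable_if. The sketch below is an assumption about how such an emulation might look, not the Mach7 source: the pattern classes wildcard, value, and disjunction and the non-pattern accidental are hypothetical, and only the disjunction operator is shown.

```cpp
#include <cassert>
#include <type_traits>

// Opt-in trait: a class is a pattern only if it says so explicitly.
template <typename P> struct is_pattern : std::false_type {};

struct wildcard {
    template <typename S> bool operator()(const S&) const { return true; }
};
template <> struct is_pattern<wildcard> : std::true_type {};

template <typename T> struct value {
    T expected;
    bool operator()(const T& s) const { return s == expected; }
};
template <typename T> struct is_pattern<value<T>> : std::true_type {};

template <typename P1, typename P2> struct disjunction {
    P1 p1; P2 p2;
    template <typename S> bool operator()(const S& s) const { return p1(s) || p2(s); }
};
template <typename P1, typename P2>
struct is_pattern<disjunction<P1, P2>> : std::true_type {};

// operator|| participates in overload resolution only for declared patterns.
template <typename P1, typename P2>
typename std::enable_if<is_pattern<P1>::value && is_pattern<P2>::value,
                        disjunction<P1, P2>>::type
operator||(const P1& a, const P2& b) { return {a, b}; }

// Syntactically a predicate, but it never opted in, so it is not a pattern.
struct accidental { bool operator()(int) const { return true; } };
static_assert(!is_pattern<accidental>::value, "accidental must not be a pattern");
static_assert(is_pattern<disjunction<wildcard, value<int>>>::value,
              "combinators of patterns are patterns");
```

With this arrangement value<int>{1} || value<int>{2} builds a new pattern type, while the overloaded operator is never considered for types such as accidental.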

Most of the patterns are applicable only to subjects of a given 
expected type or types convertible to it. This is the case, for ex- 
ample, with value and variable patterns, where the expected type 
is the type of the underlying value, as well as with the construc- 
tor pattern, where the expected type is the type being decomposed. 
Some patterns, however, do not have a single expected type and 
may work with subjects of many unrelated types. A wildcard pat- 
tern, for example, can accept values of any type without involv- 
ing a conversion. To account for this, the Pattern constraint re- 
quires the presence of a type alias AcceptedType, which given a 
pattern of type P and a subject of type S returns an expected type 
AcceptedType<P,S> that will accept subjects of type S with no or
a minimum of conversions. By default, the alias is defined in terms 
of a nested type function accepted_type_for, as follows: 



template <typename P, typename S>
  using AcceptedType = P::accepted_type_for<S>::type;

The wildcard pattern defines accepted_type_for to be an identity 
function, while variable and value patterns define it to be their 
underlying type. The constructor pattern's accepted type is the type 
it decomposes, which is typically different from the subject type. 
Mach7 employs an efficient type switch [41] under the hood to 
convert subject type to accepted type. 
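The role of accepted_type_for can be checked with a few static assertions. This is a minimal sketch under stated assumptions: the classes wildcard and value are simplified stand-ins, the function match is hypothetical, and a plain static_cast takes the place of the type switch that Mach7 uses to convert the subject to the accepted type.

```cpp
#include <cassert>
#include <type_traits>

struct wildcard {
    // Identity: whatever the subject type is, it is accepted as-is.
    template <typename S> struct accepted_type_for { typedef S type; };
    template <typename S> bool operator()(const S&) const { return true; }
};

template <typename T> struct value {
    // The underlying type, regardless of the subject type.
    template <typename> struct accepted_type_for { typedef T type; };
    T expected;
    bool operator()(const T& s) const { return s == expected; }
};

template <typename P, typename S>
using AcceptedType = typename P::template accepted_type_for<S>::type;

// Hypothetical driver: convert the subject to the accepted type, then apply.
template <typename P, typename S>
bool match(const P& p, const S& s) {
    return p(static_cast<AcceptedType<P, S>>(s));
}

static_assert(std::is_same<AcceptedType<wildcard, double>, double>::value,
              "wildcard accepts the subject type itself");
static_assert(std::is_same<AcceptedType<value<int>, short>, int>::value,
              "value pattern accepts its underlying type");
```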

Guards, n+k patterns, the equivalence combinator, and po- 
tentially some new user-defined patterns depend on capturing 
the structure (term) of lazily-evaluated expressions. All such 
expressions are objects of some type E that must satisfy the 
LazyExpression constraint: 

template <typename E> constexpr bool LazyExpression() {
  return Copyable<E>               // E must also be Copyable
      && is_expression<E>::value   // this is a semantic constraint
      && requires (E e) {          // syntactic requirements:
           ResultType<E>;          // associated result_type
           ResultType<E> == { eval(e) }; // eval(E) -> result_type
           ResultType<E> { e };    // conversion to result_type
         }; }

template <typename E> using ResultType = E::result_type;

The constraint is, again, semantic, and the classes claiming to 
satisfy it must assert it through the is_expression<E> trait. The
template alias ResultType<E> is defined to return the expression's
associated type result_type, which defines the type of the
result of a lazily-evaluated expression. Any class satisfying the 
LazyExpression constraint must also provide an implementation 
of the function eval that evaluates the result of the expression. Con- 
version to the result_type should call eval on the object in order 
to allow the use of lazily-evaluated expressions in the contexts 
where their eagerly-evaluated value is expected, e.g. a
non-pattern-matching context of the right-hand side of the Case clause.

Our implementation of the variable pattern var<T> satisfies the
Pattern and LazyExpression constraints as follows: 

template <Regular T> struct var {
  template <typename>
    struct accepted_type_for { typedef T type; };
  bool operator()(const T& t) const   // exact match
    { m_value = t; return true; }
  template <Regular S>
    bool operator()(const S& s) const // with conversion
    { m_value = s; return m_value == s; }
  typedef T result_type; // type when used in expression
  friend const result_type& eval(const var& v) // eager eval
    { return v.m_value; }
  operator result_type() const { return eval(*this); }
  mutable T m_value; // value bound during matching
};

template <Regular T> struct is_pattern<var<T>> : true_type {};
template <Regular T> struct is_expression<var<T>> : true_type {};

For semantic or efficiency reasons a pattern may have several over- 
loads of the application operator. In the example, the first alter- 
native is used when no conversion is required; thus, the variable 
pattern is guaranteed to be accepted. The second may involve 
a (possibly-narrowing) conversion, which is why we check that 
the values compare as equal after assignment. Similarly, for type 
checking reasons, accepted_type_for may (and typically will) pro- 
vide several partial or full specializations to limit the set of accept- 
able subjects. For example, the address combinator can only be ap- 
plied to subjects of pointer types, so its implementation will report 
a compile-time error when applied to any non-pointer type. 



To capture the structure of an expression, the library employs
a commonly-used technique called "expression templates" [45,
46]. In general, an expression template is an algebraic structure
$\langle T_C, \{f_1, f_2, \ldots\}\rangle$ defined over the set
$T_C = \{\tau \mid \tau \models C\}$ of all the
types $\tau$ modeling a given concept $C$. The operations $f_i$ allow one to
compose new types modeling the concept $C$ out of existing types.
In this sense, the types of all lazy expressions in Mach7 stem from
a set of a few (possibly-parameterized) basic types like var<T> and
value<T> (which both model LazyExpression) by applying type
functors like plus and minus to them. Every type in the resulting
family then has a function eval defined on it that returns a value
of the associated type result_type. Similarly, the types of all the
patterns stem from a set of a few (possibly-parameterized) patterns
like wildcard, var<T>, value<T>, C<T> etc. by applying to them
pattern combinators such as conjunction, disjunction, equivalence,
address etc. The user is allowed to extend both algebras with either
basic expressions and patterns or with functors and combinators.
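A small expression-template algebra in this spirit might look as follows. It is a sketch rather than the Mach7 code: var_ref (a leaf that refers to an external variable), value, and plus are simplified hypothetical types, and plus is the only functor provided. The type of the composed expression records its structure, and eval walks that structure only when the value is requested.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

template <typename E> struct is_expression : std::false_type {};

template <typename T> struct var_ref {         // leaf: refers to a variable
    typedef T result_type;
    const T* m_ptr;
    friend const T& eval(const var_ref& r) { return *r.m_ptr; }
};
template <typename T> struct is_expression<var_ref<T>> : std::true_type {};

template <typename T> struct value {           // leaf: holds a constant
    typedef T result_type;
    T m_value;
    friend const T& eval(const value& v) { return v.m_value; }
};
template <typename T> struct is_expression<value<T>> : std::true_type {};

// Type functor: composes a new expression type out of two existing ones.
template <typename E1, typename E2> struct plus {
    typedef typename std::common_type<typename E1::result_type,
                                      typename E2::result_type>::type result_type;
    E1 m_e1; E2 m_e2;
    friend result_type eval(const plus& e) { return eval(e.m_e1) + eval(e.m_e2); }
    operator result_type() const { return eval(*this); }   // eager contexts
};
template <typename E1, typename E2>
struct is_expression<plus<E1, E2>> : std::true_type {};

template <typename E1, typename E2>
typename std::enable_if<is_expression<E1>::value && is_expression<E2>::value,
                        plus<E1, E2>>::type
operator+(const E1& a, const E2& b) { return {a, b}; }

// The structure of the term is visible in its type.
static_assert(std::is_same<
                  decltype(std::declval<var_ref<int>>() + std::declval<value<int>>()),
                  plus<var_ref<int>, value<int>>>::value,
              "x + 10 has type plus<var_ref<int>, value<int>>");
```

Because the term is evaluated lazily, changing the referenced variable after the expression has been built changes the result of a later eval.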

The sets $T_{LazyExpression}$ and $T_{Pattern}$ have a non-empty
intersection, which slightly complicates matters. The basic types
var<T> and value<T> belong to both of those sets, and so do some
of the combinators, e.g. conjunction. Since we can only have
one overloaded operator&& for a given combination of argument
types, we have to state conditionally whether the requirements of
Pattern, LazyExpression, or both are satisfied in a given instantiation
of conjunction<T1,T2>, depending on what combination of
these concepts the argument types T1 and T2 model. Concepts,
unlike interfaces, allow modeling such behavior without multiplying
implementations or introducing dependencies.

3.2 Structural Decomposition 

Mach7's constructor pattern C<T>(P_1, ..., P_n) requires the library
to know which member of class T should be used as the subject
to P_1, which should be matched against P_2, etc. In functional languages
supporting algebraic data types, such decomposition is unambiguous
as each variant has only one constructor, which is thus
also used as a deconstructor [2, 13] to define the decomposition
of that type through pattern matching. In C++, a class may have 
several constructors, so we must be explicit about a class' decom- 
position. We specify that by specializing the library template class 
bindings. Here are the definitions that are required in order to be 
able to decompose the lambda terms we introduced in §2: 

template <> class bindings<Var> { Members(Var::name); };
template <> class bindings<Abs> { Members(Abs::var, Abs::body); };
template <> class bindings<App> { Members(App::func, App::arg); };

The variadic macro Members simply expands each of its arguments
into the following definition, demonstrated here on App::func:

static decltype(&App::func) member1() { return &App::func; }

Each such function returns a pointer-to-member that should be 
bound in position i. The library applies them to the subject in order 
to obtain subjects for the sub-patterns P_1, ..., P_n. Note that binding
definitions made this way are non-intrusive since the original class 
definition is not touched. The binding definitions also respect en- 
capsulation since only the public members of the target type will 
be accessible from within a specialization of bindings. Members 
do not have to be data members only, which can be inaccessible, 
but any of the following three categories: 

• a data member of the target type T 

• a nullary member function of the target type T

• a unary external function taking the target type T by pointer, 
reference, or value. 

Unfortunately, C++ does not yet provide sufficient compile-time 
introspection capabilities to let the library generate bindings im- 
plicitly. These bindings, however, only need to be written once for 



a given class hierarchy (e.g. by its designer) and can be reused ev- 
erywhere. This is also true for parameterized classes (§3.4). 
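The mechanics of such bindings can be sketched without the Members macro. The code below is an illustration under assumptions: the member functions member1 and member2 are written by hand, the function decompose and the pattern is_var_named are hypothetical, and App stores pointers rather than references because standard C++ does not allow forming a pointer to a member of reference type.

```cpp
#include <cassert>
#include <string>
#include <utility>

struct Term { virtual ~Term() {} };
struct Var : Term {
    std::string name;
    explicit Var(std::string n) : name(std::move(n)) {}
};
struct App : Term {                      // pointers instead of references (assumption)
    const Term* func; const Term* arg;
    App(const Term* f, const Term* a) : func(f), arg(a) {}
};

template <typename T> struct bindings;   // specialized per decomposable class

// Non-intrusive: the class definitions above are not touched.
template <> struct bindings<Var> {
    static decltype(&Var::name) member1() { return &Var::name; }
};
template <> struct bindings<App> {
    static decltype(&App::func) member1() { return &App::func; }
    static decltype(&App::arg)  member2() { return &App::arg; }
};

// Hypothetical decomposition step for a class with two bound members:
// apply each pointer-to-member to the subject and pass it to a sub-pattern.
template <typename T, typename P1, typename P2>
bool decompose(const T& subject, const P1& p1, const P2& p2) {
    return p1(subject.*bindings<T>::member1())
        && p2(subject.*bindings<T>::member2());
}

// Hypothetical sub-pattern: accepts a Var with the given name.
struct is_var_named {
    std::string expected;
    bool operator()(const Term* t) const {
        const Var* v = dynamic_cast<const Var*>(t);
        return v && v->name == expected;
    }
};
```

The sub-patterns never name App::func or App::arg themselves; they only see whatever the bindings specialization exposes in positions one and two.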

3.3 Algebraic Decomposition 

Traditional approaches to generalizing n+k patterns treat matching
a pattern f(x,y) against a value r as solving an equation
f(x,y) = r [33]. This interpretation is well-defined when there
are zero or one solutions, but alternative interpretations are possi- 
ble when there are multiple solutions. Instead of discussing which 
interpretation is the most general or appropriate, we look at n+k 
patterns as a notational decomposition of mathematical objects. 
The elements of the notation are associated with sub-components 
of the matched mathematical entity, which effectively lets us de- 
compose it into parts. The structure of the expression tree used in 
the notation is an analog of a constructor symbol in structural de- 
composition, while its leaves are placeholders for parameters to 
be matched against or inferred from the mathematical object in 
question. In essence, algebraic decomposition is to mathematical 
objects what structural decomposition is to algebraic data types. 
While the analogy is somewhat ad-hoc, it resembles the situation 
with operator overloading: you do not strictly need it, but it is so 
convenient it is virtually impossible not to have it. We demonstrate 
this alternative interpretation of the n+k patterns with examples. 

• An expression n/m is often used to decompose a rational num- 
ber into numerator and denominator. 

• An expression of the form 3q + r can be used to obtain the
quotient and remainder of dividing by 3. When r is a constant, 
it can also be used to check membership in a congruence class. 

• The Euler notation a + bi, with i being the imaginary unit, is 
used to decompose a complex number into real and imaginary 
parts. Similarly, the expressions $r(\cos\varphi + i\sin\varphi)$ and $re^{i\varphi}$ are used
to decompose it into polar form. 

• A 2D line can be decomposed with the slope-intercept form
mX + c, the linear equation form aX + bY = c, or the two-points
form $(Y - y_0)(x_1 - x_0) = (y_1 - y_0)(X - x_0)$.

• An object representing a polynomial can be decomposed for a 
specific degree: $a_0$, $a_1X^1 + a_0$, $a_2X^2 + a_1X^1 + a_0$, etc.

• An element of a vector space can be decomposed along some 
sub-spaces of interest. For example a 2D vector can be matched 
against (0,0), aX, bY, or aX + bY to separate the general case 
from cases when one or both components of the vector are 0. 

The expressions i, X, and Y in those examples are not variables, 
but rather are named constants of some dedicated type that allows 
the expression to be generically decomposed into orthogonal parts.

The linear equation and two-point forms for decomposing lines 
already include an equality sign, so it is hard to give them seman- 
tics in an equational approach. In our library that equality sign is 
not different from any other operator, like + or *, and is only used 
to capture the structure of the expression, while the exact semantics 
of matching against that expression is given by the user. This flexibility
allows us to generically encode many of the interesting cases
of the equational approach. The following example, written with 
use of Mach7, defines a function for fast computation of Fibonacci
numbers by using generalized n+k patterns: 

int fib(int n) {
  var<int> m;
  Match(n) {
    Case(any({1,2})) return 1;
    Case(2*m)   return sqr(fib(m+1)) - sqr(fib(m-1));
    Case(2*m+1) return sqr(fib(m+1)) + sqr(fib(m));
  } EndMatch // sqr(x) = x*x
}

The Mach7 library already takes care of capturing the structure
of lazy expressions (i.e. terms). To implement the semantics of 



their matching, the Mach7 user (i.e. the designer of a concrete 
notation) writes a new function overload to define the semantics 
of decomposing a value of a given type S against a term E: 

template <LazyExpression E, typename S>
  bool solve(const E&, const S&);

The first argument of the function takes an expression template rep- 
resenting a term we are matching against, while the second argu- 
ment represents the expected result. Note that even though the first 
argument is passed in with the const qualifier, it may still modify 
state in E. For example, when E is var<T>, the application operator
for const-object that will eventually be called will update a mutable 
member m_value. The following example defines a generic solver 
for multiplication by a constant c ≠ 0 of an expression e = e_1 * c.

template <LazyExpression E, typename T>
    requires Field<E::result_type>()
bool solve(const mult<E,value<T>>& e, const E::result_type& r)
  { return solve(e.m_e1, r/eval(e.m_e2)); } // e.m_e2 is c

template <LazyExpression E, typename T>
    requires Integral<E::result_type>()
bool solve(const mult<E,value<T>>& e, const E::result_type& r) {
  T c = eval(e.m_e2); // e.m_e2 is c
  return r%c == 0 && solve(e.m_e1, r/c);
}

Intuitively, matching e_1 * c against the value r in the equational
approach means solving e_1 * c = r, which means that we should try
matching the sub-expression e_1 against r/c.

The first overload is only applicable when the result type of the 
sub-expression models the Field concept. In this case, we can rely 
on the presence of a unique inverse and simply call division without 
any additional checks. The second overload uses integer division, 
which does not guarantee the unique inverse, and thus we have 
to verify that the result is divisible by the constant first. This last 
overload combined with a similar solver for addition of integral 
types is everything the library needs to support the fib example. 
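The two integral solvers and the fib example can be exercised in isolation. The following sketch makes several simplifying assumptions that are not in the library: terms are restricted to int, the term types var_ref, mult_c, and plus_c are hypothetical, the constant is stored directly in the term, and the Match statement is written as a chain of if statements.

```cpp
#include <cassert>

// Hypothetical, int-only term types (not Mach7's expression templates).
struct var_ref { int* target; };                       // a variable to bind
template <typename E> struct mult_c { int c; E e; };   // the term c * e
template <typename E> struct plus_c { E e; int c; };   // the term e + c

// Matching a variable against r binds it.
inline bool solve(var_ref v, int r) { *v.target = r; return true; }

// e + c = r  means  e = r - c.
template <typename E> bool solve(const plus_c<E>& t, int r) {
    return solve(t.e, r - t.c);
}

// c * e = r over integers: r must be divisible by c before e = r / c.
template <typename E> bool solve(const mult_c<E>& t, int r) {
    return r % t.c == 0 && solve(t.e, r / t.c);
}

inline int sqr(int x) { return x * x; }

// Fast Fibonacci via F(2m) = F(m+1)^2 - F(m-1)^2 and F(2m+1) = F(m+1)^2 + F(m)^2.
inline int fib(int n) {
    assert(n >= 1);                                    // the sketch handles n >= 1 only
    int m = 0;
    var_ref v{&m};
    if (n == 1 || n == 2) return 1;
    if (solve(mult_c<var_ref>{2, v}, n))               // n == 2*m
        return sqr(fib(m + 1)) - sqr(fib(m - 1));
    if (solve(plus_c<mult_c<var_ref>>{mult_c<var_ref>{2, v}, 1}, n)) // n == 2*m+1
        return sqr(fib(m + 1)) + sqr(fib(m));
    return 0;                                          // not reached for n >= 1
}
```

The divisibility check in the solver for multiplication is what makes the clause for 2*m reject odd subjects, so that control proceeds to the clause for 2*m+1.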

3.4 Views 

Any type T may have an arbitrary number of bindings associated
with it, which are specified by varying the second parameter of 
the bindings template: layout. The layout is a non-type template 
parameter of integral type; the layout parameter has a default value 
and is thus omitted most of the time. Our library's support of 
multiple bindings (through layouts) effectively enables a facility 
similar to Wadler's views [48]. Consider:

enum { cartesian = default_layout, polar }; // Layouts

template <class T> struct bindings<std::complex<T>>
  { Members(std::real<T>, std::imag<T>); };
template <class T> struct bindings<std::complex<T>, polar>
  { Members(std::abs<T>, std::arg<T>); };

template <class T> using Cart = view<std::complex<T>>;
template <class T> using Pole = view<std::complex<T>, polar>;

std::complex<double> c; double a,b,r,f;
Match(c)
  Case(C<Cart<double>>(a,b)) ... // default layout
  Case(C<Pole<double>>(r,f)) ... // view for polar layout
EndMatch

The C++ standard effectively forces the standard library to use
the Cartesian representation [17, §26.4-4], which is why we chose
the Cart layout as the default. We then define bindings for each
layout and introduce template aliases (an analog of typedefs for
parameterized classes) for each view. The Mach7 class view<T,l>
binds a target type with one of that type's layouts. view<T,l> can be
used everywhere the original target type T was expected.

The important difference from Wadler's solution is that our 
views can only be used in a pattern-matching context, not as con- 
structors or as arguments to functions. 
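Selecting bindings by a layout parameter can be sketched with an integral template argument. The code is illustrative only: default_layout is given an assumed value of 0, the bound members are written as static functions member1 and member2 instead of being generated by Members, and decompose is a hypothetical helper that stands in for a constructor pattern with two variable sub-patterns.

```cpp
#include <cassert>
#include <cmath>
#include <complex>

enum { default_layout = 0 };                    // assumed value of the default
enum { cartesian = default_layout, polar };     // layouts, as in the text

// Second parameter selects the layout and defaults to default_layout.
template <typename T, int layout = default_layout> struct bindings;

template <typename T> struct bindings<std::complex<T>, cartesian> {
    static T member1(const std::complex<T>& c) { return std::real(c); }
    static T member2(const std::complex<T>& c) { return std::imag(c); }
};
template <typename T> struct bindings<std::complex<T>, polar> {
    static T member1(const std::complex<T>& c) { return std::abs(c); }
    static T member2(const std::complex<T>& c) { return std::arg(c); }
};

// Hypothetical helper: bind both components selected by the layout.
template <int layout, typename T>
void decompose(const std::complex<T>& c, T& first, T& second) {
    first  = bindings<std::complex<T>, layout>::member1(c);
    second = bindings<std::complex<T>, layout>::member2(c);
}
```

The same subject is thus decomposed in two ways without any change to std::complex, and the choice is made entirely at compile time.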

3.5 Match Statement 

In functional languages with built-in pattern matching, relational 
matching on multiple subjects is usually reduced to nested match- 
ing on a single subject by wrapping multiple arguments into a tu- 
ple. In a library setting, we are able to provide a more efficient im- 
plementation if we keep the arguments separated. This is why our 
Match statement extends the efficient type switch for C++ [41] to 
handle multiple subjects (both polymorphic and non-polymorphic)
(§3.5.1) and to accept patterns in case clauses (§3.5.2).

3.5.1 Multi-argument Type Switching 

The core of our efficient type switch [41] is based on the fact that 
virtual table pointers (vtbl-pointers) uniquely identify subobjects in 
the object and are perfect for hashing. Open type switch maps these 
vtbl-pointers to jump targets and necessary this-pointer offsets and 
provides an amortized constant-time dispatch to the appropriate 
case clause. Its efficiency relies on the optimal hash function $H_{kl}$
built for a set of vtbl-pointers $V$ seen by a type switch. It is chosen
by varying the parameters $k$ and $l$ to minimize the probability of
conflict. The parameter $k$ represents the logarithm of the size of
cache, while the parameter $l$ is the number of low bits to ignore.

A Morton order (aka Z-order) is a function that maps multidimensional
data to one dimension while preserving the locality of
the data points [26]. A Morton number of an $N$-dimensional coordinate
point is obtained by interleaving the binary representations
of all coordinates. The original one-dimensional hash function $H^V_{kl}$
applied to arguments $v \in V$ produced hash values in a tight range
$[0..2^k[$ where $k \in [K, K+1]$ for $2^{K-1} < |V| \le 2^K$. The produced
values were close to each other, which improved the cache hit rate
due to increased locality of reference. The idea is thus to use Morton
order on these hash values - not on the original vtbl-pointers
- in order to preserve locality of reference. To do this, we retain a
single parameter $k$ reflecting the size of the cache, but we keep $N$
optimal offsets $l_i$, one for each argument $i$.

Consider a set $V^N = \{(v_1^1,\ldots,v_1^N),\ldots,(v_n^1,\ldots,v_n^N)\}$ of
$N$-dimensional tuples representing the set of vtbl-pointer combinations
coming through a given Match statement. As with the one-dimensional
case, we restrict the size $2^k$ of the cache to be not
larger than twice the closest power of two greater or equal to
$n = |V^N|$: i.e. $k \in [K, K+1]$, where $2^{K-1} < |V^N| \le 2^K$.
For a given $k$ and offsets $l_1,\ldots,l_N$ a hash value of a given
combination $(v^1,\ldots,v^N)$ is defined as

$$H_{k l_1 \ldots l_N}(v^1,\ldots,v^N) = \mu\left(\frac{v^1}{2^{l_1}},\ldots,\frac{v^N}{2^{l_N}}\right) \bmod 2^k,$$

where the function $\mu$ returns the Morton number (bit interleaving)
of $N$ numbers.
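
The hash just described can be sketched in a few lines of C++. This is only an illustration under stated assumptions, not the library's code: the names morton and morton_hash are hypothetical, vtbl-pointers are passed as plain integers, the division by a power of two is taken to be a right shift, and the cache is assumed to have fewer than 2^64 cells.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Morton number of N values: bit b of value i lands at position b*N + i.
// Input bits that would not fit into 64 result bits are dropped.
template <std::size_t N>
std::uint64_t morton(const std::array<std::uint64_t, N>& xs) {
    std::uint64_t result = 0;
    for (std::size_t bit = 0; bit * N + (N - 1) < 64; ++bit)
        for (std::size_t i = 0; i < N; ++i)
            result |= ((xs[i] >> bit) & 1u) << (bit * N + i);
    return result;
}

// Hypothetical N-argument hash: ignore the l[i] low bits of each
// vtbl-pointer, interleave the results, keep the k low bits.
template <std::size_t N>
std::uint64_t morton_hash(const std::array<std::uintptr_t, N>& vtbls,
                          const std::array<unsigned, N>& l, unsigned k) {
    assert(k < 64);
    std::array<std::uint64_t, N> shifted{};
    for (std::size_t i = 0; i < N; ++i)
        shifted[i] = static_cast<std::uint64_t>(vtbls[i]) >> l[i];
    return morton<N>(shifted) & ((std::uint64_t{1} << k) - 1);
}
```

Because the low bits of every argument end up next to each other in the interleaved value, truncating to $k$ bits keeps a contribution from each argument instead of only from the first one.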

As in the one-dimensional case, we vary the parameters $k, l_1, \ldots, l_N$
in their finite and small domains to obtain an optimal hash function
$H_{k l_1 \ldots l_N}$ by minimizing the probability of conflict on values from
$V^N$. Unlike the one-dimensional case, we do not try to find the
optimal parameters every time we reconfigure the cache. Instead,
we only try to improve the parameters to render fewer conflicts in
comparison to the number of conflicts rendered by the current
configuration. This does not prevent us from eventually converging to
the same optimal parameters, which we do over time, but is important
for holding constant the amortized complexity of the access.
We demonstrate in §4.3 that - similarly to the one-dimensional case
- such a hash function produces few collisions on real-world class
hierarchies, and yet it is simple enough to compute that it competes
well with alternatives that can cope with relational matching.
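
The "improve rather than optimize" reconfiguration can be illustrated with a small two-argument sketch. Everything here is an assumption made for illustration: the names Params, hash2, conflicts and improve are hypothetical, a conflict is counted as an entry that lands in an already occupied cell, and the caller decides which candidate parameters to try on a given reconfiguration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

struct Params { unsigned k, l1, l2; };  // cache has 2^k cells; l1, l2 are offsets
using Combination = std::pair<std::uint64_t, std::uint64_t>;

// Two-argument Morton hash: shift, interleave, keep the k low bits.
inline std::uint64_t hash2(std::uint64_t v1, std::uint64_t v2, const Params& p) {
    const std::uint64_t a = v1 >> p.l1, b = v2 >> p.l2;
    std::uint64_t h = 0;
    for (unsigned bit = 0; bit < 32; ++bit) {
        h |= ((a >> bit) & 1u) << (2 * bit);
        h |= ((b >> bit) & 1u) << (2 * bit + 1);
    }
    return h & ((std::uint64_t{1} << p.k) - 1);
}

// Number of entries that map to a cell some earlier entry already uses.
inline std::size_t conflicts(const std::vector<Combination>& seen, const Params& p) {
    assert(p.k < 32);
    std::vector<bool> used(std::size_t{1} << p.k, false);
    std::size_t n = 0;
    for (const auto& v : seen) {
        const auto cell = static_cast<std::size_t>(hash2(v.first, v.second, p));
        if (used[cell]) ++n; else used[cell] = true;
    }
    return n;
}

// Keep the current parameters unless a candidate gives strictly fewer conflicts.
inline Params improve(const std::vector<Combination>& seen, Params current,
                      const std::vector<Params>& candidates) {
    std::size_t best = conflicts(seen, current);
    for (const Params& c : candidates) {
        const std::size_t n = conflicts(seen, c);
        if (n < best) { best = n; current = c; }
    }
    return current;
}
```

Bounding the candidate list bounds the work done per reconfiguration, while repeated calls can still move towards parameters with fewer conflicts.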



3.5.2 Support for Patterns 

Given a statement Match($e_1$,...,$e_N$) applied to arbitrary expressions
$e_i$, the library introduces several names into the scope of
the statement: e.g. the number of arguments $N$, the subject types
subject_type$_i$ (defined as decltype($e_i$) modulo type qualifiers),
and the number of polymorphic arguments $M$. When $M > 0$ it
also introduces the necessary data structures to implement efficient
type switching [41]. Only the $M$ arguments whose subject_type$_i$
are polymorphic will be used for fast type switching.

For each case clause Case($p_1$,...,$p_N$) the library ensures that
the number of arguments to the case clause $N$ matches the number
of arguments to the Match statement, and that the type $P_i$ of every
expression $p_i$ passed as its argument models the Pattern concept.
For each subject_type$_i$ it introduces target_type$_i$ - the result of
evaluating the type function AcceptedType<$P_i$,subject_type$_i$> -
into the scope of the case clause. This is the type the pattern expects
as an argument on a subject of type subject_type$_i$ (§3.1), which is
used by the type switching mechanism to properly cast the subject
if necessary. The library then introduces the names match$_i$ of type
target_type$_i$& bound to properly cast subjects and available to
the user in the right-hand side of the case clause in the event of
a successful match. The qualifiers applied to the type of match$_i$
reflect the qualifiers applied to the type of the subject $e_i$. Finally,
the library generates code that sequentially applies each pattern to
properly cast subjects, making the clause's body conditional:

if (p_1(match_1) && ... && p_N(match_N)) { /* body */ }
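
The shape of this generated condition can be imitated with ordinary function objects. The sketch below is not Mach7's implementation: wildcard, value, variable and match_all are hypothetical stand-ins, and patterns and subjects are passed interleaved (p_1, s_1, p_2, s_2, ...) to keep the code short.

```cpp
#include <cassert>

// Hypothetical minimal patterns: a pattern is any object callable on a subject.
struct wildcard {
    template <typename T> bool operator()(const T&) const { return true; }
};

template <typename T>
struct value {
    T expected;
    bool operator()(const T& subject) const { return subject == expected; }
};

template <typename T>
struct variable {
    T* target;  // where the bound value is stored on a match
    bool operator()(const T& subject) const { *target = subject; return true; }
};

// No pattern/subject pairs left: the clause matches.
inline bool match_all() { return true; }

// Apply the first pattern to the first subject, then recurse on the rest.
// && short-circuits, so later patterns never run after a failure.
template <typename P, typename S, typename... Rest>
bool match_all(const P& p, const S& s, const Rest&... rest) {
    return p(s) && match_all(rest...);
}
```

Since every pattern type is known at compile time, a compiler is free to inline the whole chain into plain comparisons; whether it does so depends on the compiler and the patterns involved.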

When type switching is not involved, the generated code imple- 
ments the naive backtracking strategy, which is known to be in- 
efficient as it can produce redundant computations [5, §5]. More- 
efficient algorithms for compiling pattern matching have been de- 
veloped since [1, 21, 23, 24, 37]. Unfortunately, while these al- 
gorithms cover most of the typical kinds of patterns, they are not 
pattern-agnostic as they make assumptions about the semantics of 
concrete patterns. A library-based approach to pattern matching is 
agnostic of the semantics of any given user-defined pattern. The in- 
teresting research question in this context would be: what language 
support is required to be able to optimize open patterns? 

The main advantage from using pattern matching in Mach7 
comes from the fast type switching weaved into the Match state- 
ment. It effectively skips case clauses that will definitely be rejected 
because their target type is not one of the subject's dynamic types. 
Of course, this is only applicable to polymorphic arguments; for 
non-polymorphic arguments, the matching is done naively with a 
cascade of conditional statements. 

4. Evaluation 

We performed several independent studies of our pattern match- 
ing solution to test its efficiency and impact on the compilation 
process. In the first study, we compare various functions written 
with pattern matching to functionally-equivalent hand-optimized
code in order to estimate the overhead added by the
composition of patterns (§4.1). We demonstrate this overhead for 
both our solution and the patterns as objects approach. In the sec- 
ond study, we compare the impact on compilation times of both 
approaches (§4.2). In the third study, we looked at how well our 
extension of the Match statement to N arguments using the Morton
order deals with large real-world class hierarchies (§4.3). In the 
fourth study, we compare the performance of matching N poly- 
morphic arguments against double, triple, and quadruple dispatch 
via visitor design pattern as well as open multi-methods extension 
to C++ (§4.4). In the last study, we rewrote the optimizer of an ex- 
perimental language from Haskell into C++. We compare the ease 
of use, readability, and maintainability of the original Haskell code 
and its Mach7 equivalent (§4.5).



The studies involving performance comparisons have been performed
on a Sony VAIO® laptop with Intel® Core™ i5 460M CPU
at 2.53 GHz, 6 GB of RAM, and Windows 7 Professional. All the
code was compiled with G++ (versions 4.5.2, 4.6.1, and 4.7.2, all
run under MinGW with -O2 and producing 32-bit x86 binaries)
and Visual C++ (versions 10.0 and 11.0, both with profile-guided
optimizations).

To improve accuracy, timing was performed using the x86
RDTSC instruction. For every number reported we ran 101 experiments,
timing 1,000,000 top-level calls each. (Depending on arguments,
there may have been a different number of recursive calls.)
The first experiment served as a warm-up, and typically resulted in
an outlier with the largest time. The number of cycles per top-level
call, averaged over 1,000,000 calls, was computed for each of the
101 experiments; these values were sorted and the median was chosen.
We preferred the median to the average to diminish the influence of
other applications and OS interrupts as well as to improve
reproducibility of timings between the application runs. In
particular, in the diagnostic boot mode of Windows 7, where the
minimum of drivers and background applications are loaded, we got
the same number of cycles per iteration 70-80 out of 101 times.
Timings in non-diagnostic boots had somewhat larger absolute values,
but the relative performance remained unchanged and equally
well-reproducible.
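
The measurement procedure can be sketched as follows. This is not the harness used for the paper: it substitutes the portable std::chrono::steady_clock for the RDTSC instruction, so it reports nanoseconds rather than cycles, and the names median and median_time_per_call are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <utility>
#include <vector>

// Middle element after sorting; for an odd count such as 101 this is the
// median, for an even count it is the upper of the two middle elements.
inline double median(std::vector<double> samples) {
    assert(!samples.empty());
    std::sort(samples.begin(), samples.end());
    return samples[samples.size() / 2];
}

// Runs `experiments` experiments of `calls` calls each and returns the
// median of the per-call averages, in nanoseconds.
template <typename F>
double median_time_per_call(F&& f, std::size_t experiments = 101,
                            std::size_t calls = 1000000) {
    using clock = std::chrono::steady_clock;
    std::vector<double> per_call;
    per_call.reserve(experiments);
    for (std::size_t e = 0; e < experiments; ++e) {
        const auto start = clock::now();
        for (std::size_t c = 0; c < calls; ++c) f();
        const std::chrono::duration<double, std::nano> elapsed = clock::now() - start;
        per_call.push_back(elapsed.count() / static_cast<double>(calls));
    }
    return median(std::move(per_call));
}
```

Taking the median means that a slow warm-up run, or a run disturbed by an interrupt, shifts the result far less than it would shift a mean.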

4.1 Pattern Matching Overhead 

The overhead associated with pattern matching may come from: 

• Naive (sequential and often duplicated) order of tests due to a 
pure library solution. 

• The compiler's inability to inline the test expressed by the 
pattern in a case clause's left-hand side (e.g. due to lack of 
[type] information or due to the complexity of the expression). 

• The compiler's inability to elide construction of pattern trees 
when used in the right-hand side of a case clause. 

To estimate the overhead introduced by the commonly-used pat- 
terns as objects approach and our patterns as expression templates 
approach (§3.1), we implemented several simple functions, both 
with and without pattern matching. The handcrafted code we com- 
pared against was hand-optimized by us to render the same results, 
without changes to the underlying algorithm. Some functions were 
implemented in several ways with different patterns in order to 
show the impact on performance of different patterns and pattern 
combinations. The overhead of both approaches on a range of re- 
cent C++ compilers is shown in Figure 1. 
Figure 1. Pattern Matching Overhead 



The experiments marked with * correspond to the functions in 
§2 and §3.3. The rest of the functions, including all the implemen- 
tations using the patterns as objects approach, are available on the 
project's web page. The patterns involved in each experiment are 
abbreviated as follows: 1 - value pattern; v - variable pattern; _ - 
wildcard pattern; n+k - n+k (application) pattern; + - equivalence 
combinator; & - address combinator; C - constructor pattern. 

The overhead incurred by compile-time composition of patterns
in the patterns as expression templates approach is significantly
smaller than the overhead of run-time composition of patterns in
the patterns as objects approach. In some cases, shown in the table 
in bold, the compiler was able to eliminate the overhead entirely. In 
the case of the "lambdas" experiment, the advantage was due to the 
underlying type switch, while in the other cases the generated code 
utilized the instruction pipeline and the branch predictor better. 

In each experiment, the handcrafted baseline implementation 
was the same in both cases (compile-time and run-time compo- 
sition) and reflected our idea of the fastest code without pattern 
matching describing the same algorithm. For example, gcd3
implemented the fast Euclidean algorithm with remainders, while
gcd1 and gcd2 implemented its slower version with subtractions.
The baseline code correspondingly implemented the fast
Euclidean algorithm for gcd3 and the slow one for gcd1 and gcd2.
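
For reference, the two baseline algorithms mentioned here look roughly as follows when written without pattern matching. These are textbook versions given as a sketch; the function names are hypothetical and the paper's actual baseline code may differ in detail.

```cpp
#include <cassert>

// Slow variant: repeated subtraction. Requires a > 0 and b > 0.
inline unsigned gcd_subtract(unsigned a, unsigned b) {
    assert(a > 0 && b > 0);
    while (a != b) {
        if (a > b) a -= b;
        else       b -= a;
    }
    return a;
}

// Fast variant: remainders. gcd_remainder(a, 0) is a.
inline unsigned gcd_remainder(unsigned a, unsigned b) {
    while (b != 0) {
        const unsigned r = a % b;
        a = b;
        b = r;
    }
    return a;
}
```

Comparing each pattern-matching version against the baseline for the same algorithm isolates the cost of the patterns from the cost of the algorithm.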

The comparison of the overhead incurred by both approaches 
would be incomplete without the details of our implementation of 
the patterns as objects solution. In particular, dealing with objects 
in object-oriented languages often involves heap allocation, sub- 
type tests, garbage collection, etc., which can all significantly affect 
performance. To make this comparison applicable to a wider range 
of object-oriented languages, we took the following precautions in 
the patterns as objects implementations: 

• All the objects involved were stack-allocated or statically allo- 
cated. This measure was taken to avoid allocating objects on 
the heap, which is known to be much slower. Many compilers 
of object-oriented languages perform the same optimization. 

• Objects representing constant values, as well as patterns whose
state does not change during pattern matching (e.g. wildcard
and value patterns), were all statically allocated.

• Patterns that modify their own state were constructed only when 
they were actually used, since a successful match by a previous 
pattern may return early from the function. 

• Only the arguments that were actually pattern-matched were 
boxed into the object class hierarchy; e.g. in the case of the 
power function only the second argument was boxed. 

• Boxed arguments were statically typed with their most derived
type to avoid unnecessary type checks and conversions,
e.g. object_of<int>&, which is a class derived from object and
that represents a boxed integer, instead of just object&.

• No objects were returned as a result of a function, as in a truly
object-oriented approach that might require heap allocation.

• n+k patterns that effectively require evaluating the result of 
an expression were implemented with an additional virtual 
function that simply checks whether a result is a given value. 
This does not allow expressing all the n+k patterns of Mach7, 
but was sufficient to express all those involved in the experi- 
ments and allowed us to avoid heap-allocating the results. 

• When run-time type checks were unavoidable (e.g. inside the
implementation of pattern::match) we compared type IDs first,
and only when the comparison failed did we invoke the much
slower dynamic_cast to optimize the common case.

With these precautions in place, the main overhead of the pat- 
terns as objects solution was in the cost of a virtual function call 
(pattern::match) and the cost of run-time type identification and
conversion on its argument (the subject). Both are specific to the 
approach and not to our implementation, so similar overhead is 
present in other object-oriented languages following this strategy. 
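
A minimal sketch of the patterns as objects strategy shows where these two costs arise. Only the names object, object_of and pattern::match come from the text above; the class bodies, fast_cast, wildcard_pattern and int_value_pattern are assumptions made for illustration and are not the code used in the experiments.

```cpp
#include <cassert>
#include <typeinfo>

struct object { virtual ~object() = default; };

// A boxed value of type T.
template <typename T>
struct object_of : object {
    T value;
    explicit object_of(T v) : value(v) {}
};

// Compare type IDs first; fall back to dynamic_cast only if they differ.
template <typename Target>
const Target* fast_cast(const object& o) {
    if (typeid(o) == typeid(Target))
        return static_cast<const Target*>(&o);
    return dynamic_cast<const Target*>(&o);
}

struct pattern {
    virtual ~pattern() = default;
    virtual bool match(const object& subject) = 0;  // one virtual call per test
};

struct wildcard_pattern : pattern {
    bool match(const object&) override { return true; }
};

struct int_value_pattern : pattern {
    int expected;
    explicit int_value_pattern(int e) : expected(e) {}
    bool match(const object& subject) override {
        // Run-time type identification and conversion of the subject.
        const object_of<int>* boxed = fast_cast<object_of<int>>(subject);
        return boxed != nullptr && boxed->value == expected;
    }
};
```

Every application of a pattern pays for the virtual call to match and for the type check inside it, regardless of how the objects were allocated.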

4.2 Compilation Time Overhead 

Several people expressed concerns about a possible significant in- 
crease in compilation time due to the openness of our pattern- 
matching solution. While this might be the case for some patterns 
that require a lot of compile-time computations, it is not the case 
with any of the common patterns we implemented. Our patterns 
are simple top-down instantiations that rarely go beyond standard
overload resolution or the occasional enable_if condition. Furthermore,
we compared the compilation time for each of the examples
discussed in §4.1 with a handcrafted version. 
Table 1. Compilation Time Overhead 



As can be seen in Table 1, the difference in compilation times
was small: on average, 3.99% slower for open patterns and 4.84%
slower for patterns as objects, with patterns compiling faster in a
few cases (indicated in bold). The difference will be smaller in
real-world projects with a larger amount of non-pattern-matching code.

4.3 Multi-argument Hashing 

To check the efficiency of hashing in the multi-argument Match 
statement (§3.5) we used the same class hierarchy benchmark we 
used to test the efficiency of hashing in type switch [41, §4.4]. 
The benchmark consists of 13 libraries describing 15,246 classes. 
Not all the class hierarchies originated from C++, but all were 
written by humans and represent their respective problem domains. 

While the Match statement works with both polymorphic and 
non-polymorphic arguments, only the polymorphic arguments are 
taken into consideration for efficient type switching and thus ef- 
ficient hashing. It also generally only makes sense to apply type 
switching to non-leaf nodes of the class hierarchy. 71% of the 
classes in the entire benchmark suite were leaf classes. For each 
of the remaining 4,369 non-leaf classes we created 4 functions, 
performing case analysis on derived classes with 1, 2, 3 and 4 argu- 
ments, respectively. Each of the functions was executed with differ- 
ent combinations of possible derived types, including, in the case 
of repeated multiple inheritance, different sub-objects within the 
same type. There were 63,963 different subobjects when the class 
hierarchies used repeated multiple inheritance and 38,856 different 
subobjects with virtual multiple inheritance. 

As with type switching, for each of the 4,369 functions (per 
same number of arguments) we measured the number of conflicts 
m in cache: the number of entries mapped to the same location 
in cache by the optimal hash function. We then computed the 
percentage of functions that achieved a given number of conflicts, 
shown in Figure 2. 
Figure 2. Percentage of N-argument Match statements with given
number of conflicts (m) in cache 

We grouped the results in ranges of exponentially-increasing
size because we noticed that the number of conflicts per Match
statement for multiple arguments was not as tightly distributed
around 0 as it was for a single argument. However, the main
observation still holds: in most of the cases, we could achieve hashing
without conflicts, as can be seen in the first column (marked [0]). 
The numbers are slightly better when virtual inheritance is used 
because the overall number of possible subobjects is smaller. 

4.4 Comparison of Alternatives for Relational Matching 

Relational matching on classes depends on the efficient discovery 
of the sought-after combinations of dynamic types of the subjects. 
This can be performed in a number of different ways including, for 
example, the techniques used to implement multiple dispatch. We
compare the efficiency of type switching on multiple arguments
with that of other relational matching alternatives based on
double, triple and quadruple dispatch [16], as well as our own
implementation of open multi-methods for C++ [36].

The need for multiple dispatch rarely happens in practice, di- 
minishing with the number of arguments involved in dispatch. 
Muschevici et al. [27] studied a large corpus of applications in 6 
languages and estimate that single dispatch amounts to about 30% 
of all the functions, while multiple dispatch is only used in 3% 
of functions. In application to type switching, this indicates that 
we can expect case analysis on the dynamic type of a single ar- 
gument much more often than on dynamic types of two or more 
arguments. However, this does not mean that pattern matching in 
general reflects the same trend, as additional arguments are often 
introduced into the Match statement to check some relational prop- 
erties. These additional arguments are typically non-polymorphic 
and thus do not participate in type switching, which is why in this 
experiment we only deal with polymorphic arguments. 

Figure 3 contains 4 bar groups corresponding to the number of 
arguments used for multiple dispatch. Each group contains 3 wide 
bars representing the number of CPU cycles per iteration it took the 
N-Dispatch, Open Type Switch and Open Multi-methods solutions 
to perform the same task. Each of the 3 wide bars is subsequently 
split into 5 narrow sub-bars representing performance achieved by 
G++ 4.5.2, 4.6.1, 4.7.2 and Visual C++ 10 and 11, in that order. 
Figure 3. N-argument Match statement vs. visitor design pattern 
and open multi-methods 

Open multi-methods give the best performance because the dispatch
is implemented with an N-dimensional array lookup, requiring
only 4N + 1 memory references before an indirect call. N-dispatch
runs the slowest, requiring 2N virtual function calls (accept/visit
for each dimension). Open type switch falls between the
two, thanks to its efficient hashing combined with a jump table. 
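
To make the 2N figure concrete, here is a sketch of double dispatch (N = 2) built from the visitor design pattern. The hierarchy (Shape, Circle, Square) and all other names are hypothetical and chosen only for illustration; the benchmark code itself may be organized differently.

```cpp
#include <cassert>
#include <string>

struct Circle;
struct Square;

struct Visitor {
    virtual ~Visitor() = default;
    virtual void visit(const Circle&) = 0;
    virtual void visit(const Square&) = 0;
};

struct Shape {
    virtual ~Shape() = default;
    virtual void accept(Visitor& v) const = 0;
};
struct Circle : Shape { void accept(Visitor& v) const override { v.visit(*this); } };
struct Square : Shape { void accept(Visitor& v) const override { v.visit(*this); } };

// The actual cases, one overload per combination of dynamic types.
inline std::string describe(const Circle&, const Circle&) { return "circle/circle"; }
inline std::string describe(const Circle&, const Square&) { return "circle/square"; }
inline std::string describe(const Square&, const Circle&) { return "square/circle"; }
inline std::string describe(const Square&, const Square&) { return "square/square"; }

// Second stage: the first argument's type is already known statically.
template <typename First>
struct SecondVisitor : Visitor {
    const First& first;
    std::string result;
    explicit SecondVisitor(const First& f) : first(f) {}
    void visit(const Circle& second) override { result = describe(first, second); }
    void visit(const Square& second) override { result = describe(first, second); }
};

// First stage: discovers the dynamic type of the first argument.
struct FirstVisitor : Visitor {
    const Shape& second;
    std::string result;
    explicit FirstVisitor(const Shape& s) : second(s) {}
    void visit(const Circle& first) override { result = dispatch_second(first); }
    void visit(const Square& first) override { result = dispatch_second(first); }
private:
    template <typename First>
    std::string dispatch_second(const First& first) {
        SecondVisitor<First> v(first);
        second.accept(v);  // virtual calls 3 (accept) and 4 (visit)
        return v.result;
    }
};

inline std::string describe_pair(const Shape& a, const Shape& b) {
    FirstVisitor v(b);
    a.accept(v);           // virtual calls 1 (accept) and 2 (visit)
    return v.result;
}
```

The sketch also shows the intrusiveness of the approach: every class needs an accept member, and the Visitor interface must list every class in the hierarchy.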

In terms of memory, given a class hierarchy of $n$ classes (actually
$n$ subobjects in the subobject graph) and multiple dispatch
on $N$ arguments, all 3 solutions require memory proportional to
$O(n^N)$. More specifically, if $\delta$ is the number of bytes used by a
pointer, then each of the approaches will use:

• Open Multi-methods: $\delta(n^N + Nn + N)$

• N-Dispatch: $\delta(n^N + n^{N-1} + \cdots + n^2 + n)$

• Open Type Switch: $\delta((2N + 3)n^N + N + 7)$

bytes of memory. In all 3 cases, the memory counted represents the
non-reusable memory specific to the implementation of a single
function dispatched through $N$ polymorphic arguments. Note that
$n$ is a variable here since new classes may be loaded at run-time
through dynamic linking in all 3 solutions, while $N$ is a constant,
representing the number of arguments to dispatch on.

The memory used by each approach is allocated at different 
stages. The memory used by the virtual tables involved in the N- 
dispatch solution as well as the dispatch tables used by open multi- 
methods will be allocated at compile/link time and will be reflected 
in the size of the final executable. Open multi-methods might re- 
quire additional allocations and/or recomputation at load time to 
account for dynamic linking. In both cases, the memory allocated 
covers all possible combinations of n classes in N argument positions.
In the case of open type switch, the memory is only allocated
at run-time and grows proportionally to the number of actual argu- 
ment combinations seen by the type switch (§3.5.1). Only in the 
worst case, when all possible combinations have been seen by the 
type switch, does it reach the size described by the above formula. 
This is an important distinction, as in many applications many pos- 
sible combinations will never be seen: for example, in a compiler 
the entities representing expressions and types might all be derived 
from a common base class, but they will rarely appear in the same 
type switch together. 

There is also a significant difference in the ease of use of these 
solutions. N-dispatch is the most restrictive solution as it is intru- 
sive (and thus cannot be applied retroactively), hinders extensibil- 
ity (by limiting the set of distinguishable cases), and is surprisingly 
hard to teach students. While analyzing Java idioms used to emu- 
late multiple dispatch in practice, Muschevici et al. [27, Figure 13] 
noted that there are significantly more uses of cascading instanceof 
in the real code than the uses of double dispatch, which they also 
attribute to the obscurity of the second idiom. Both N-dispatch 
and open multi-methods also introduce control inversion in which 
the case analysis is effectively structured in the form of callbacks. 
Open multi-methods are also subject to ambiguities, which have 
to be resolved at compile time and in some cases might require 
the addition of numerous overriders. Neither problem occurs with 
open type switch, where the case analysis is performed directly and 
ambiguities are avoided by the use of first-fit semantics. 

4.5 Rewriting Haskell Code in C++ 

For this experiment, we took existing code written in Haskell and 
asked its author to rewrite it in C++ with Mach7. The code in 
question is a simple peephole optimizer for an experimental GPU 
language called Versity. We assisted the author along the way to see 
which patterns he used and what kind of mistakes he made. 

Somewhat surprisingly to us, we found that the pattern-matching 
clauses generally became shorter, but their right-hand side became 
longer. The shortening of case clauses was perhaps specific to this 
application and mainly stemmed from the fact that Haskell does not 
support equivalence patterns or an equivalence combinator and had 
to use guards to relate different arguments. This was particularly 
cumbersome when the optimizer was looking at several arguments 
of several instructions in the stream, e.g.: 

peep2 (x1:x2:xs) =
  case (x1,x2) of
    ((InstMove a b),(InstMove c d)) | (a==d)&&(b==c) -> ...

compared to the functionally-equivalent Mach7 code:

Match(*x1,*x2) {
  Case(C<InstMove>(a,b), C<InstMove>(+b,+a)) ...

Haskell also requires the programmer to use a wildcard pattern in
every unused position of a constructor pattern (e.g. InstBin _ _ _ _),
while Mach7 allows the omission of all the trailing wildcards
(e.g. C<InstBin>()). The use of named patterns avoided many repeated
expressions and improved performance and readability:

auto either = val(src) || val(dst);
Match(inst) {
  Case(C<InstMove>(_, either)) ...
  Case(C<InstUn>  (_, either)) ...
  Case(C<InstBin> (_, _, _, either)) ...
} EndMatch
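
How a pattern such as either can live in an ordinary variable is easiest to see in a stripped-down sketch. The code below only imitates the idea and is not Mach7's implementation: value_pattern, either_pattern and is_pattern are hypothetical names, and this val covers just the value-pattern case.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical value pattern: matches subjects equal to a stored value.
template <typename T>
struct value_pattern {
    T expected;
    bool operator()(const T& subject) const { return subject == expected; }
};

template <typename T>
value_pattern<T> val(T v) { return {v}; }

// Alternation of two patterns, composed at compile time.
template <typename P1, typename P2>
struct either_pattern {
    P1 first;
    P2 second;
    template <typename T>
    bool operator()(const T& subject) const { return first(subject) || second(subject); }
};

// A trait standing in for the Pattern concept.
template <typename P> struct is_pattern : std::false_type {};
template <typename T> struct is_pattern<value_pattern<T>> : std::true_type {};
template <typename A, typename B> struct is_pattern<either_pattern<A, B>> : std::true_type {};

// operator|| is overloaded only when both operands are patterns.
template <typename P1, typename P2,
          typename = std::enable_if_t<is_pattern<P1>::value && is_pattern<P2>::value>>
either_pattern<P1, P2> operator||(P1 a, P2 b) { return {a, b}; }
```

The variable's type records the whole structure of the pattern, which is why auto is needed to name it and why no virtual call is involved when it is applied.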

Mach7 suffered a disadvantage in the code after the pattern match- 
ing, as we had to both explicitly manage memory when inserting, 
removing, or replacing instructions in the stream and explicitly 
manage the stream itself. Eventually we could hide some of this 
boilerplate behind smart pointers and other standard library classes. 

4.6 Limitations 

While our patterns can be saved in variables and passed to func- 
tions, they are not true first-class citizens as one cannot create a 
run-time data structure of patterns (e.g. a composition of patterns 
based on user input). This is similar to how polymorphic (template) 
functions are not considered first-class citizens in C++. This could
potentially be solved by mixing in the patterns as objects approach;
however, the performance overhead we saw in §4.1 is too high for
that approach to be adopted.

5. Related Work 

Language support for pattern matching was first introduced for 
string manipulation in COMIT [50], which subsequently inspired 
similar primitives in SNOBOL [10]. SNOBOL4 had string pat- 
terns as first-class data types, providing operations for concatena- 
tion and alternation. The first reference to modern pattern-matching 
constructs as seen in functional languages is usually attributed 
to Burstall's work on structural induction [4]. Pattern matching 
was further developed by the functional programming community,
most notably ML [12] and Haskell [15]. In the context of
object-oriented programming, pattern matching was first explored 
in Pizza [31] and Scala [9, 30]. The idea of first-class patterns dates 
back at least to Tullsen's proposal to add them to Haskell [44]. The 
calculus of such patterns has been studied in detail by Jay [19, 20]. 

There are two main approaches to compiling pattern-matching 
code: the first is based on backtracking automata and was intro- 
duced by Augustsson [1], and the second is based on decision trees 
and was first described by Cardelli [5]. The backtracking approach 
usually generates smaller code [21], whereas the decision tree ap- 
proach produces faster code by ensuring that each primitive test is 
only performed once [24].

There have been several attempts to bring pattern matching into 
various languages by way of a library. They differ in which abstrac- 
tions of the host language were used to encode the patterns and the 
match statement. MatchO was one of the first such attempts for 
Java [47]. The approach follows the patterns as objects strategy. 
Functional C# was a similar approach, bringing pattern matching 
to C# as a library [34]. The approach uses lambda expressions and 
chaining of method calls to create a structure that is then evalu- 
ated at run time for the first successful match. In the functional 
community, Rhiger explored the introduction of first-class pattern 
matching into Haskell as a library [38]. He uses functions to encode
patterns and pattern combinators, which allows him to detect
pattern misapplication errors at compile time through the Haskell 
type system. Racket has a powerful macro system that allows it
to express open pattern matching in the language entirely as a
library [43]. The solution is remarkable in that unlike most of the
library approaches to open pattern matching, it does not rely on 
naive backtracking and, in fact, encodes the optimized algorithm 
based on backtracking automata [1, 21]. Grace is another program- 
ming language that provides a library solution to pattern matching 
through objects [14]. Similar to other control structures in the lan- 
guage, Grace encodes the match statement with partial functions 
and lambda expressions, while patterns are encoded as objects. 

Multiple language extensions have been developed to bring
pattern matching into a host language in the form of a compiler,
preprocessor or tool. Prop brought pattern matching and term rewriting
into C++ [22]. It did not offer first-class patterns, but supported
most of the functional-style patterns and provided an optimizing
compiler for both pattern matching and garbage-collected
term rewriting. App was another pattern-matching extension to 
C++ [29] that mainly concentrated on providing syntax for defin- 
ing algebraic data types and pattern matching on them. Tom is a 
pattern-matching compiler that brings a common pattern-matching 
and term-rewriting syntax into Java, C, and Eiffel. Thanks to its 
distinct syntax, it is transparent to the semantics of the host lan- 
guage and can be implemented as a preprocessor to many other 
languages. Tom neither supports first-class patterns, nor is open 
to new patterns. Matchete is a language extension to Java that 
brings together different flavors of pattern matching: functional- 
style patterns, Perl-style regular expressions, XPath expressions, 
Erlang's bit-level patterns, etc. [13]. The extension does not try to 
make patterns first-class citizens, but instead concentrates on im- 
plementing existing best practices and their tight integration into 
Java. OOMatch is another Java extension; it brings pattern match- 
ing and multiple dispatch close together [39]. The approach gener- 
alizes multiple dispatch by offering to use patterns as multi-method 
arguments and then orders overriders based on the specificity of 
their arguments. Similar to other such systems, the approach only 
deals with a limited set of built-in patterns. 

Thorn is a dynamically-typed scripting language that provides 
first-class patterns [2]. The language defines a handful of atomic 
patterns and pattern combinators to compose them, and, similarly to 
Newspeak and Grace, uses the duality between partial functions 
and patterns to support user-defined patterns. 

When a class hierarchy is fixed, we can design a pattern lan- 
guage that involves semantic notions represented by the hierarchy. 
Pirkelbauer devised a pattern language for Pivot [8] capable of rep- 
resenting various entities in a C++ program using syntax very close 
to C++ itself. The patterns were translated by a tool into a set of 
visitors implementing the pattern-matching semantics [35]. 

6. Conclusions and Future Work 

The Mach7 library provides functional-style pattern-matching fa- 
cilities for C++. The solution is open to new patterns, with the tra- 
ditional patterns implemented as an example. It is non-intrusive, 
so it can be applied retroactively. The library provides efficient 
and expressive matching on multiple subjects and compares well 
to multiple dispatch alternatives in terms of both time and space. 
We also offer an alternative interpretation of the n+k patterns and
show how some traditional generalizations of these patterns can be
implemented in our library. Mach7 pattern matching code performs
reasonably compared to open multi-methods and visitors, demon- 
strating the effectiveness of the library-based approach. 

The work presented here continues our research on pattern 
matching for C++ [41]. Due to the page limit, we had to omit many 
interesting details that provide a better insight into our solution. We 
refer the reader to the first author's PhD thesis [40] for an in-depth 
discussion of open type switching, open pattern matching and open 
multi-methods in the context of C++. 



In the future, we would like to implement an actual language ex- 
tension that will be capable of working with open patterns. Given 
such an extension and its implementation, we would like to look 
into how code for such patterns can be optimized without hardcod- 
ing the knowledge of the semantics of the patterns into the com- 
piler. We would also like to experiment with other kinds of pat- 
terns (including those defined by the user), look at the interaction 
of patterns with both the standard library and other facilities in the 
language, and make views less ad-hoc. 